Search Resource List
bow
- A machine-learning toolkit in C source code for statistical language modeling, text retrieval, classification, and clustering. Bow (or libbow) is a library of C code useful for writing statistical text analysis, language modeling and information r…
CosineSimilarAlgorithmzf
- Draws on several mathematical and statistical techniques: TF-IDF weights, cosine similarity to measure text similarity, variance to compute the Euclidean distance between two data points, and k-means for data clustering.
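The TF-IDF weighting and cosine-similarity steps mentioned in this entry can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code shipped in the package; the function names `tfidf_vectors` and `cosine_similarity`, the particular TF and IDF formulas, and the sample documents are all assumptions made for the example.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document (hypothetical helper)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Assumed weighting: relative term frequency times log(N / df).
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# Made-up sample documents, already split into tokens.
docs = [
    "text clustering with k-means".split(),
    "text similarity with cosine".split(),
    "image retrieval with wavelets".split(),
]
vecs = tfidf_vectors(docs)
sim_01 = cosine_similarity(vecs[0], vecs[1])  # share the term "text"
sim_02 = cosine_similarity(vecs[0], vecs[2])  # share only "with", which has IDF 0
```

Because "with" appears in every sample document, its IDF is zero, so the first and third documents end up with zero similarity while the first and second remain weakly similar through "text".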
julei
- Text classification and clustering implemented with the Weka tool; includes the files needed for testing.
DataStructTest
- K-means text clustering method (IDEA project package); runs as soon as it is downloaded.
TextSummarizer-master
- A document summarizer that first clusters the text of a document and then summarizes it.
cluster
- Proposes a text clustering algorithm based on a semantic inner product space model.
jieba-master
- Mainly for big-data development on Hadoop, text word segmentation, and analysis of clustering algorithms.
CBIR-system
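- As computer technology has developed, image retrieval applications have matured. By retrieval type they fall into two classes: text-based image retrieval and content-based image retrieval. This thesis studies several core algorithms of content-based image retrieval (the K-means algorithm for cluster analysis, the Haar wavelet transform for extracting low-level visual features from images, and F-norm theory for similarity measurement) in order to design an offline image retrieval system.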
EnglishChuLi
- A text preprocessing program written in Python, with implementation code for every step: removing punctuation, removing stop words, similarity computation, PCA dimensionality reduction, clustering, and visualization. It runs in PyCharm with a Python 3 development environment.
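The first two preprocessing steps listed in this entry (removing punctuation and removing stop words) might look something like the sketch below. It is an illustration under stated assumptions rather than the package's own code: the `preprocess` function and the tiny `STOP_WORDS` set are hypothetical, and only ASCII punctuation is handled.

```python
import string

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}

def preprocess(text, stop_words=STOP_WORDS):
    """Lowercase, strip ASCII punctuation, split on whitespace, drop stop words."""
    table = str.maketrans("", "", string.punctuation)
    tokens = text.lower().translate(table).split()
    return [tok for tok in tokens if tok not in stop_words]

tokens = preprocess("The quick, brown fox is in the garden!")
print(tokens)  # ['quick', 'brown', 'fox', 'garden']
```

A real pipeline would normally use a fuller stop-word list and pass the resulting tokens on to the similarity, PCA, and clustering steps.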
ChineseChuLi
- A Python program for Chinese text processing, including word segmentation, special-character removal, stop-word removal, a web crawler, PCA dimensionality reduction, k-means clustering, and visualization.
文本分析聚类实战 (Text Analysis and Clustering in Practice)
- Text mining extracts implicit, potentially useful information from large volumes of text data. Text mining supports three tasks: Associate (association analysis, finding association rules from co-occurrence frequencies); Cluster (grouping similar documents or terms); and Categorize (assigning texts to predefined categories).
情感分析用词语集·知网hownet词典 (Sentiment-analysis word set, HowNet dictionary)
- This software supports a range of text analyses: microblog analysis, chat analysis, whole-web analysis, website analysis, browsing analysis, word segmentation, word frequency statistics, English word frequency statistics, traffic analysis, and cluster analysis.
kmeans
- Segments Chinese text with jieba, converts the segmented result into word vectors with word2vec, and clusters the Chinese text with k-means.
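The final k-means step of such a pipeline can be sketched in plain Python as below. The sketch assumes the jieba segmentation and word2vec steps have already produced one numeric vector per document; how the package derives document vectors is not stated in the listing, so the `doc_vectors` values here are made-up toy data and the `kmeans` function is a hypothetical helper, not the package's implementation.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means (Lloyd's algorithm) on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Initialize centroids from k distinct input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids

# Toy stand-ins for document vectors (assumed, not real word2vec output).
doc_vectors = [
    (0.9, 0.1), (1.0, 0.0), (0.8, 0.2),
    (0.1, 0.9), (0.0, 1.0), (0.2, 0.8),
]
labels, centroids = kmeans(doc_vectors, k=2)
```

With this toy data the first three vectors end up in one cluster and the last three in the other; which cluster receives label 0 depends on the random initialization.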